RapidMiner Studio documentation, version 10.2
Sample Collection (Operator Toolbox)
Synopsis
A collection is a list of items. This operator takes a collection and samples it down to a given sample size.
Description
The operator provides three sampling methods (see the sampling_method parameter) to perform the sampling. The sample_size parameter sets the number of items in the sampled output collection.
Input
 exa (Collection) - The collection to be sampled.
Output
 col (Collection) - The sampled collection.
 org (Collection) - The original collection.
Parameters
- sampling_method The method to use for sampling.
  - linear sampling: Take the first n objects of the collection.
  - shuffled sampling: Take n distinct objects of the collection at random.
  - bootstrap sampling: Take n random objects of the collection. The same object can be drawn several times.
- sample_size The number of objects to be drawn.
- use_local_random_seed Indicates whether a local random seed should be used.
- local_random_seed If use_local_random_seed is checked, this parameter determines the local random seed.
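The three sampling methods can be illustrated outside RapidMiner with a short Python sketch. This is not the operator's implementation; the function name sample_collection, the method labels, and the choice to raise an error when shuffled sampling asks for more objects than the collection holds are all assumptions made for illustration.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_collection(
    items: Sequence[T],
    sample_size: int,
    sampling_method: str = "shuffled",
    local_random_seed: Optional[int] = None,
) -> List[T]:
    """Hypothetical helper mirroring the three sampling methods."""
    if sample_size < 0:
        raise ValueError("sample_size must be non-negative")

    # A fixed seed makes the random draws reproducible;
    # None lets Python seed the generator itself.
    rng = random.Random(local_random_seed)

    if sampling_method == "linear":
        # Take the first n objects, in their original order.
        return list(items[:sample_size])

    if sampling_method == "shuffled":
        # Take n distinct objects at random (without replacement).
        # Assumption: asking for more objects than exist is an error.
        if sample_size > len(items):
            raise ValueError("sample_size exceeds the collection size")
        return rng.sample(list(items), sample_size)

    if sampling_method == "bootstrap":
        # Take n objects at random with replacement,
        # so the same object may appear several times.
        return [rng.choice(items) for _ in range(sample_size)]

    raise ValueError(f"unknown sampling_method: {sampling_method!r}")


collection = ["a", "b", "c", "d", "e"]
print(sample_collection(collection, 2, "linear"))
print(sample_collection(collection, 3, "shuffled", local_random_seed=42))
print(sample_collection(collection, 8, "bootstrap", local_random_seed=42))
```

Note that only bootstrap sampling can return more objects than the collection contains, because it draws with replacement.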
Tutorial Processes
Grouping an ExampleSet into a collection
This process groups the Titanic data set into bins by passenger fare and then selects two of the resulting price ranges at random.
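The idea behind this tutorial process can be sketched in plain Python. The passenger records, the bin width of 25, the seed, and the helper name group_by_fare_bin are hypothetical stand-ins, not values from the actual Titanic data set or the tutorial process; the sketch also assumes the two ranges are drawn without replacement.

```python
import random
from collections import defaultdict

# Hypothetical (passenger, fare) records; not taken from the real data set.
passengers = [
    ("p1", 7.25),
    ("p2", 71.28),
    ("p3", 8.05),
    ("p4", 53.10),
    ("p5", 26.55),
    ("p6", 151.55),
]
BIN_WIDTH = 25.0  # assumed width of each fare bin


def group_by_fare_bin(records, bin_width):
    """Map each (lower, upper) fare range to the passengers that fall in it."""
    bins = defaultdict(list)
    for name, fare in records:
        lower = (fare // bin_width) * bin_width
        bins[(lower, lower + bin_width)].append(name)
    return dict(bins)


# Step 1: group the records into a collection of fare bins.
fare_bins = group_by_fare_bin(passengers, BIN_WIDTH)

# Step 2: pick two distinct price ranges at random, with a fixed seed
# so that repeated runs select the same ranges.
rng = random.Random(2024)
selected_ranges = rng.sample(sorted(fare_bins), 2)

for price_range in selected_ranges:
    print(price_range, fare_bins[price_range])
```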
